

\section{Approval} % phase
\label{sec:approval}

We now describe the subprotocol that provide the approval validity checks required before finality.  


We preserve the previous notation from \S\ref{sec:backing} and \S\ref{sec:availability}:  

Any parachain block $B$ on a parachain $\rho$ has a candidate proof-of-validity blob $\blobB$ that contains the witness proofs for any parachain state interactions, as well as an associated candidate receipt $\receipt_B$ that contains the Merkle root $\merkleroot_B$ for the Merkle tree with leaves consisting of the erasure coded pieces $\prepieces_B$ of $\blobB$.  We distributed the authenticated pieces $\pieces_B$ that consist of one piece from $\prepieces_B$ and its witness copath for $\merkleroot_B$.  

We handle attested candidate receipts $\receipt_{B,S}$ that consist of its abridges candidate receipt $\receipt_B$ and validity attestations.  We first in \S\ref{sec:backing} posted candidate receipts $\receipt_{B,S}$ on chain once they received sufficient backing.  We then in \S\ref{subsec:availability} accumulated on-chain the availability attestations for them.  We finally recognized availability on-chain once this reached $2f+1$, which determines when candidate parablocks actually interacts wit the relay chain. 

In this section, we consider a relay chain block $R$ in which several candidate receipts $\receipt_{B,S}$ were declared available and executed.  In \S\ref{sec:backing}, we freely announced $\receipt_{B,S}$ on chain without impacting relay chain validity because doing so interacted with nothing.  As $B$ executes in $R$ however, we cannot consider $R$ valid unless we consider $B$ valid, but so far we have validity attestations for $B$ only from $S \subset \vals_\para$, and $\vals_\rho$ is a tiny and dangerously foreseeable set of validators.  We discuss here the approval phase for $R$ in which an unforeseeable selection of approval checkers provide attestations.

The approval phase for $R$ has two components, approval checker assignments and actually performing the checks.  We have three flavours of approval checker assignments, which all roughly follow the pattern of
\begin{itemize}
\item first VRF announcements via gossip, and
\item then computation of approval checker assignments.
\end{itemize}
\noindent 
All approval checker assignments trigger the same approval check protocol, which consists of
\begin{itemize}
\item retrieval and reconstruction of full candidate proof-of-validity blobs, 
\item running approval checks on the candidate proof-of-validity blobs, and 
\item announcing these candidates validity or invalidity.
\end{itemize}
We describe this second component first because ???why???.

In spirit, candidate validity attestations could come from any validator, and might ideally be saved, due to being slashable, but we only interpret these signatures as relevant towards specific goals depending upon the validator's role, like backing or approval, when they come from nodes with the appropriate role assigned.  We shall discusses invalidity claims and unavailability complaints by non-validators, like collators and fishermen, in the next section.


\subsection{Checking}
\label{sec:approcal_checks}

In this subsection, we consider a validator $V$ who passed some approval checker assignment criteria subphase:  First, $V$ has constructed and announced via gossip some approval checker VRF signature $\pi_{V,\cdot}$.  Second, there exists a parachain $\rho$ determined by the specific approval checker assignment criteria applied to $\VRF.\Out(\pi)$.  We let $B$ denote the parachain candidate $B$ for $\rho$ in $R$ and $\receipt_{B,\cdot}$ its candidate receipt.  

We need only information posted on-chain to validate the VRF signatures $\pi_{V,\cdot}$ for approval checker assignment criteria, but some assignment criteria employ some off-chain information.  We employ absence from the chain occasionally, which seems unproblematic.  We might package equivocation proofs (see below) with the VRF signatures, depending upon their size.  We always prevent late stage adversarial influence from shifting approval checker assignments by making assignments strictly additive, and always in batches.
%
%TODO: Move paragraph where?


\subsubsection{Retrieval}
\label{sec:retrieval}

We never consider substructure of $B$ to be meaningful, so $V$ must {\em retrieve} the full {\em candidate proof-of-validity blob} $\blobB$ before checking.  Now $V$ knows which which nodes have their individual pieces, thanks to their availability announcements.  It thus follows from our $2f+1$ honesty assumption that $V$ could always reconstruct $\blobB$ by obtaining $f+1$ pieces $\pieces_B$ from nodes known to posses them, with at most $f$ dishonest nodes not responding.  

We note however that $V$ also knows that all pieces are known by the preliminary backing validity checkers aka parachain validators who approved $\blobB$, as well as approval checkers who already approved $\blobB$.  So $V$ could first contact some node that possesses all of $\blobB$, and only then begin a full reconstruction process. 

In both cases, $V$ must recompute $\pieces_B$ to verify $\receipt_{B,\cdot}$.  We therefore cannot see much computational difference between $V$ reconstructing $\pieces_B$ from arbitrary pieces or from $\blobB$ itself.  It remains plausible $V$ avoids some networking overhead by asking for $\blobB$ though.  We think a first implementation should reconstruct $\pieces_B$ from arbitrary pieces, while leaving requests for the full $\blobB$ as future optimisations. 

Ideally $V$ might retrieve the pieces in $\pieces_B$ only using its existing connections in our topology specified above, except these intentionally do not include 1/3rd of validators.  Also, $V$ need not connect to any node with all of $\pieces_B$. 
Interestingly, $V$ already connects to at least one prachain validator in $\vals_\rho$ who often checked $B$ first, which indicates that transferring the full blob $\blobB$ may prove optimal.  

We caution against abandoning approval checkers over topology concerns however because then adversarial influence over the topology could wreck our assignment criteria below.

Also, our retrieval component could be engineered to avoid requests entirely:  After obtaining $\pi_{V,\cdot}$, another validator $V'$ could simply compute its own priority for sending its piece from $\pieces_B$ to $V$.  We caution that doing do might become inefficient, either because $V$ winds up rejecting sends, or when many nodes go offline.  

%TODO: How much linkage with the topology section?

\subsubsection{Announcement}

After $V$ obtains enough pieces to recompute and verify $\receipt_{B,\cdot}$ from $\blobB$, then $V$ verifies $\blobB$ itself by running the execute block function of $\rho$.  If both these checks pass, then $V$ signs $\receipt_{B,\cdot}$ and announces its signature via gossip.  If either check fails, then $V$ claims invalidity for $\receipt_{B,\cdot}$, which results in all validators checking $B$.


\subsection{Assignments}
\label{sec:assignment}

\newcommand\netdelay{\ensuremath{\Delta_{\mathsf{net}}}}
\newcommand\checkdelay{\ensuremath{\Delta_{\mathsf{check}}}}

We consider self-assignment schemes determined by several paramaters:
\begin{itemize}
\item our target number $\ell$ of eventual checkers,
\item our total number $n'$ of candidate checkers, normally $n' = \nvals - \npvals$,
\item the weight $\theta$ of a no-show checker with $1 \leq \theta \leq 2$,
\item a network delay bound $\netdelay$ for gossip messages, and
\item a block checking delay bound $\checkdelay$ that covers both retrieval and validation.
\end{itemize}
We never split the check bound $\checkdelay$ into separate retrieval and validation bounds, because doing so could only improves latency in extremely niche situations, and maybe the extra gossip costs more. 

\subsubsection{Verifiable random sampling}

We propose self-assignment schemes based on {\em verifiable random sampling} using a VRF:  

Any protocol run requires some seed $z$ and some associated time $T_0$ that marks the protocols' start time.  As a rule, we select $z$ to satisfy some freshness assumption, meaning our adversary learns $z$ before some time $T_0$ with odds less than our ambiant security treshold $f/n$.

Any validator $V$ with key pair $(\pk,\sk)$ computes its VRF signature $\tilde{\pi}_{V,z} := \VRF.\Sign_\sk(z)$, which it announces when specified by the specific protocol, but not before $T_0$.  Anyone else verifies $\tilde{\pi}_{V,z}$ with $\pk$.  All participants seed their sampling procedure for $V$'s assignment with the verifiable randomness $\tilde{\omega}_{V,z} := \VRF.\Out(\pi_{V,z})$.

We prefer $z$ already be known to verifiers, but our protocols run far too fast to enforce this.  We expect premature gossip messages that arrive before the verifier knows $z$ require subtle caching and expiration policies, but some scenarios provide a short proof for $z$, which may or may not be worth including in gossip messages.

We remark that gossip messages could be signed with $\pi$ itself by incorporating the output into the full gossip message, which gets signed as extra message data.  If done, this reduces the computational cost for signature verification by a third to a half, but may violate protocol layering and complicate validating protocol security and correctness, audits, etc. 

We always need a target number $\ell$ of eventual checkers when sampling assignments based on $\tilde{\omega}_{V,z}$.  We shall often alter $\ell$ during protocol runs though, so assignments must be covariant in $\ell$, meaning increasing $\ell$ should not invalidate assignments made with smaller $\ell$.  We avoid rejection sampling after $\ell$ enters the sampling computation for this reason.

\subsubsection{Generic one-shot delay}

We first define a generic ``one-shot'' self-assignment subprotocol in which the seed $z$ uniquely identifies a parachain candidate $B$ on some parachain $\rho$.  We term this a ``one-shot'' protocol in that validators reveal their assignment at the last possible moment before checking, which costs us efficiency but gives code simplicity and completeness.  It also gives us a fig leaf of defense against some adaptive adversary classes.  We describe this subprotocol generically because it acts as a basis for several self-assignment subprotocols, thanks to being one-shot, and should enable considerable code reuse.  

All validators track the set $A_z$ of valid announced checker claims, as well as the subset $S_z$ of all validity claims $S_B$ for $B$ generated by nodes in $A_z$.  We identify here the generic one-shot protocol run by $z$ because multiple $z$ may imply the same $B$, except we emphasise that messages exist only for $S_B$ not $S_z$, and that our protocol run ends by considering $S_z$ sufficient.  

We uniformly sample discrete time delays from $\{ \netdelay, \netdelay {n' \over \ell} \}$, so that an expected $\ell$ receive delay $\netdelay$.  We use verifiable sampling here of course, meaning we compute this delay $d_{V,z}$ from $\tilde{\omega}_{V,z}$.  

As noted above, we shall alter $\ell$ during protocol runs, which prevents involving it too directly.  Instead, we let $\ell_0$ denote the initial value of $\ell$ to define 
$$ d_{V,z} = \netdelay \lfloor {n' \over \ell_0 m} \omega_{V,z} \rfloor $$
where $m$ is the upper bound on $\omega_{V,z}$, i.e.\ $\omega_{V,z} \in [0,m)$.  Integer division without rejection sampling suffices because checker with high delays never check or even announce their values.
%
In this, we implicitly condense checking assignments into discrete delay tranches.  In particular, all checkers with $d_{V,z} = 0$ check at time $T_0$, which prevents powerful adversaries from forcing nodes before them. 
%
In Rust, we might sample with code resembling
\begin{verbatim}
fn generic_oneshot_assigned_delay(
    ctx: &'static [u8],
    network_delay: u32, 
    candidates: u32,
    target: u32,
    omega: ::schnorrkel::vrf::VRFInOut,
) -> u32 {
    assert!(network_delay > 0); // Unrealistic network otherwise
    assert!(candidates > 0 && target > 0); // No candidates, not even us.
    let mut s = u32::from_le_bytes( omega.make_bytes(ctx) ) as u64;
    s *= (candidates as u64) * (network_delay as u64);
    s /= (u32::value_max() as u64) + 1;
    s.saturating_sub(network_delay as u64)
    .try_into::<u32>().expect("candidates * network_delay > u32::value_max()")
    / target
}
\end{verbatim}
%TODO: Rewrite code

Consider a validator $V$ with current time $t$.  We describe how $V$ tracks the announcements $A_z$ and their answering validity claims $S_z$, and deals with no-shows.

$V$ records the time $t_a$ it receives each checker announcement $a \in A_z$.  Also $V$ defines its no-show count $\ell'$ to be the number of $a \in A_z$ for which both $t_a + \checkdelay < t$ and the owner $V'\in \vals$ of $a$ does not yet appear in $S_z$.  As input, we take a no-show suspicion factor $\theta'$ with $0 \le \theta' \le 2$, by which we multiply $\ell'$ below.

Any validator $V$ announces via gossip their $\tilde{\pi}_{V,z}$, and begins retrieving and checking $B$, whenever
\begin{enumerate}
\item $T_0 + d_{V,z} + \netdelay < t$,
\item $|A_z| < \ell + \theta \ell'$, and
\item $V \notin \vals \setminus \vals_\rho$.
\end{enumerate}
In some protocols, we do increase $\ell$ to compensate for absent parachain validators in $\vals_\rho$, so revealing a validity claim from a $V \in \vals_\rho$ may reduce $\ell$, but such a claim never counts towards $S_z$.

Also $V$ considers the block $B$ valid whenever $|S_z| \ge \ell + \theta \ell'$, but one-shot subprotocol cannot terminiate themselves because their caller might later increase $\ell$.  We ask the caller manage prioritisation for work on $z$ and $B$.

% \subsection{Criteria}
%
% $\equivocationchecks$
% $\basesecondarychecks$

\subsubsection{Initial:}

% TODO: Fluff section, so move elsewhere?

We recall that a relay chain block $R$ always has an associated VRF signature $\hat{\pi}_R$ whose output $\hat{\omega}_R := \VRF.\Out_(\hat{\pi}_R)$ gets recycled for the random beacon, and perhaps seals the block $R$ itself.  In Praos \cite{Praos} or BABE \cite{BABE}, $\hat{\omega}_R$ equals the VRF output $\omega_R$ that justifies creating $R$, but secret single leader elections like Sassafras \cite{Sassafras} might reveal $\omega_R$ early.  We want $\hat{\omega}_R$ to be the random beacon source that only the block producer of $R$ knows in advance, and that even they can only influence by choosing wether or not to make $R$.
% TODO: Should this be in AnV-1-setting.tex and shoud that say more about VRFs? 

We run the initial approval checker assignments that apply to a candidate receipt $\receipt_B$ based on $\hat{\omega}_R$ where $R$ declares $\blobB$ available, as well as the parachain $\rho$ of $B$.  In this way, we minimize adversaries' influence the $\hat{\pi}_R$ used.  In fact, adversaries could always withhold availability reports until their planned $\hat{\pi}_R$, but delaying approvals sounds more complex than delaying backing votes.

An adversary wants to influence $\hat{\pi}_R$ to give their $B$ at least $\ell??$ dishonest approval checkers.  We determine assignments using the assigned parties' VRFs, so adversaries cannot prevent honest checkers being assigned too, but if they force many low delays then approval checkers with higher delays never check.

% Implementations have two options for these initial approval checker assignments, either repeatedly employ the one-shot scheme described above, or else reduce variance among validator workloads with a packed scheme described below.

\subsubsection{Initial one-shot:}

We describe using the one-shot self-assignment subprotocol:  All validators run the one-shot self-assignment subprotocol for each parachain $\rho$ declared available in $R$ using the seed $z_B := \hat{\omega}_R || \rho.\mathsf{seed}$.\footnote{We write $z_B$ instead of $z_{R,\rho}$ because parachains have only one block declared available per relay chain block.}  We outline computing the associated target number $\ell_B$ of eventual checkers for the one-shot assignment subprotocol: 

As discussed previously, our approval phase begins only after acquiring enough preliminary backing parachain validators from $\rho$.  We suppose here that $\npvals$ preliminary backing checks to always be realistic, making fewer than $\npvals$ preliminary backing checks suspicious.  We set $\ell'_B$ to be the number $|\vals_\rho|$ of parachain validators for $\rho$ minus $|S \cap \vals_\rho|$. 
% TODO: \npvals vs |\vals_\rho|

We set the no-show suspicion factor $\theta'$ in the range $0 \le \theta' \le 2$, as noted above.
In additions, we accumulate several dynamic report factors, like an unavailability report factors $\rfuv$ from validators and $\runav$ from collators, and an invalidity report factor $\rinv$ from both collators and untrusted fishermen.  We set the target number $\ell$ of approval checkers to be the sum of these, along with any missing preliminary backers $\npvals-\primarychecks$.  So
$$ \ell_B := \basesecondarychecks + \lceil \theta' \ell'_B + \rfuv + \runav + \rinv \rceil \mathperiod $$

Implementations might slightly delay accounting for $\theta' \ell_\rho$, and possibly other factors, to prevent $\ell$ from being overly large initially. 

\subsubsection{Initial modulo:}

In one-shot schemes, all validators have separate $z_B$ for each parachain $\rho$ represented in $R$, which results in statistically independent VRF signatures $\hat{\pi}_{V,B}$ for each $\rho$, which remain secret until $R$ declares $\receipt_{B,S}$ available.  
Independence here protects against adversaries arranging that all checks of $B$ be run by their own validators.  

As a cost to independence, our candidate $B$ might receive too few or too many approval check assignments.  We partially address the too few case with our delay system, which improves the too many case indirectly, but delays admits some adversarial manipulation.  We avoid more complex prioritization rules because reducing independence between validators admits adversarial manipulation.

In the same vein, any individual validator could similarly receive too few or too many approval check assignments.  We gain little from individual validator's initial assignments being independent across parachains.  We therefore sacrifice independence across parachains for a more balanced distribution.

We ask that each $V$ announce one $\tilde{\pi}_{V,R} := \VRF.\Sign_{V.\sk}(\hat{\omega}_R)$ via gossip for $R$ as soon as it considers $R$ to be available.  In other words, we make $\hat{\omega}_R$ alone the seed to obtain an output $\tilde{\omega}_{V,R} := \VRF.\Out(\tilde{\pi}_{V,R})$ associated to $R$, not any particular $B$.  

Now let $n_A$ denote the number of availability cores.  Also let $A_R$ be the map from $\{ 1,\ldots,n_A \}$ to parachains declared available in $R$.  We assign $V$ to check the parachain $A_R( \tilde{\omega}_{V,R} \mod n_A )$ with a delay of zero.  

In other words, any validators $V$ assigned by their $\tilde{\pi}_{V,R}$ check before any nodes assigned by their delay.  We should retain the delay checkers with a delay of zero, primarily because we cannot prevent availability cores from going unused, so that $A_R$ returns nothing for most checkers.  

All nodes should announce their $\tilde{\pi}_{V,R}$, and then run their assigned check.  Any abdications signal nodes being offline as a useful side effect.


\subsubsection{Equivocation:}

% TODO:  Alistair, Jeff, and Handan must discudd this part carefully to set \equivocationchecks.  I preserved that equivocation checks use parachain block hashes, not relay chain block hashes, from Handan's write up, which make sense because more equivocations does not require more checks.  It makes (3) from her write up make no sense.  Also her (4) used #Vcehck in a strange way. 

We say two distinct relay chain blocks $R$ and $R'$ are {\em equivocation} for one another if they use the same VRF output $\hat{\omega}_{R'} = \hat{\omega}_R$ for the relay chain randomness.  We say $R$ has an equivocation $R'$ in this case, or just has equivocations to merely express the existence of $R'$.
%
We distrust relay chain blocks with equivocations of course, and must mark them as having equivocations, but we sadly cannot invalidate them because doing so creates attacks on finality. 
% TODO: Explain?

We therefore fear an adversary might create an $R'$ with an innocent $B'$ on $\rho$ merely to learn the approval checkers determined by $\hat{\omega}_R$ for $\rho$, but then equivocate with another $R$ that contains a malicious $B$ on $\rho$.  In this scenario, our equivocation $R$ triggers exactly the same approval checks for $B$ as $R'$ did for $B'$, which wrecks our advantage over the adversary.

We say a candidate receipt $\receipt_{B,S}$ has {\em equivocations} if a relay chain block $R$ that declares $\receipt_{B,S}$ available has an {\em equivocation} $R'$ that declares a (conflicting) candidate receipt $\receipt_{B',S'}$ available.

An equivocation among candidates requires both a relay chain fork and a relay chain equivocation, which makes them extremely suspicious.  We thus want additional checks that remain unforeseeable to our adversary.  
%
At the same time, we cannot prevent validator operators from running poorly designed custom key management infrastructure, either to mitigate internal threats or for high availability, which makes relay chain equivocations and even candidate equivocations plausible.  We should therefore address them without triggering excessive checks against innocent parachain candidates under repeated equivocations.

We find this compromise as follows:  If $\receipt_{B,S}$ has equivocations, then we instantiate the generic one-shot self-assignment subprotocol with seed $z_B = H(B)$\footnote{If desired, replace $H(B)$ by the Merkle root $\merkleroot_B$.}, so that each parachain candidate gets assigned only an extra checkers group only once.  In other words, we work with VRF signatures $\tilde{\pi}_{V,B} := \VRF.\Sign_{V.\sk}(H(B))$ and do verifiable sampling with the VRF outputs $\omega_{V,B} := \VRF.\Out(\tilde{\pi}_{V,B})$, as opposed to the $\tilde{\pi}_{V,R,\rho}$ or $\tilde{\pi}_{V,R}$ from our initial checks.

In this one-shot application, we could choose our target number $\ell$ of approval checkers to be a constant $\equivocationchecks$, perhaps equal to $\basesecondarychecks$, or at the other extreme reuse our target number $\ell_B$ of approval checkers as defined for the initial one-shot variant.  We suggest however that report factor from $\ell_B$ by reconsidered under the equivocations attack scenario.

In this way, an adversary who equivocates always faces $\ell$ extra checks against their new parachain block $B$, without impacting the innocent parachains too much when faced with numerous equivocations.

Implementers might suppress this check if parachains always carry less value than gets slashed when relay chain block producers equivocate.  Attacks could impact multiple parachains simultaneously but doing so requires significantly more resources.

\subsubsection{Rewards}

We reward validators for both their assignment announcements as well as actually running their assigned approval checks.  Incidentally, the initial modulo assignments strategy produces more uniform rewards than the one-shot strategy, although amortized rewards work out similarly.  

We do not slash validators for not delivering their assigned checks, but ... 


\subsubsection{Finality} 
\label{sec:finality}

As noted above, all validators accumulate approval checks in the checks $S$ of an attested candidate receipt $\receipt_{B,S}$.  All our assignment schemes come with applicability criteria, like $R$ having equivocations, as well as fulfilment criteria, like $|S_z| \ge \ell + \theta \ell'$.  

All validators would eventually get every approval assignment, which ensures we have enough assignments, and that no-shows eventually get replaced.  As a result, any individual validator faces several obstacles to approving $R$, either
\begin{itemize}
\item $R$ contains an available but invalid parachain candidate receipt $\receipt_{B,S}$, which results in slashing its backers in $S$, or else
\item $R$ contains an unavailable candidate receipt $\receipt_{B,S}$, which eventually redirects block production in \S\ref{sec:unavailability}. 
\end{itemize}

We impose one further assignment scheme mentioned previously in \S\ref{sec:parachain}:  If any two relay chain validators ever give opposing validity attestations for some candidate $B$ included in $R$, then all relay chain validators should produce approval validations for $B$.  In this case, we accumulate votes until $f+1$ claim validity or invalidity, and then slash the loosing side.  We cannot slash if neither side reaches $f+1$, but we still declare the block invalid in that case.  We expect governance to identify software faults and manually revert slashes they cause, but governance can also manually institute slashes in this second case, or manually slash $\para$ for offenses like malicious code or improper non-determinism. 

We judge $R$ approved when all applicabile assignment schemes appear fulfilled.  Any validator $V$ judges $R$ to be {\em strongly available} when $2f+1$ validators including $V$ itself announced availability claim for every candidate in $R$, meaning they claim they possess their own piece from $\pieces_B$.  An approved and strongly available relay chain block $R$ becomes eligible for consideration by our finality gadget GRANDPA.


